Skip to content

Cache frontend build across setup runs#4404

Open
danielhanchen wants to merge 5 commits into
mainfrom
fix/cache-frontend-build
Open

Cache frontend build across setup runs#4404
danielhanchen wants to merge 5 commits into
mainfrom
fix/cache-frontend-build

Conversation

@danielhanchen
Copy link
Copy Markdown
Member

Summary

  • Skip frontend npm install + npm run build if frontend/dist/ already exists
  • Previously setup.ps1 deleted node_modules and package-lock.json on every run, forcing a full rebuild every time unsloth studio setup was called
  • setup.sh also always rebuilt even when dist/ was present from a previous run (it only skipped for non-editable pip installs inside site-packages/)
  • To force a rebuild, users can delete frontend/dist/ and re-run setup

Changes

setup.ps1:

  • Add elseif (Test-Path $DistDir) check to skip build when dist/ exists
  • Remove the node_modules and package-lock.json deletion on every run

setup.sh:

  • Simplify the skip condition from site-packages/* && dist/ exists to just dist/ exists
  • Works for all install paths: PyPI, editable, git clone with prior build

Test plan

  • First unsloth studio setup on a fresh git clone builds frontend as before
  • Second unsloth studio setup skips the frontend build with a message
  • Deleting frontend/dist/ and re-running triggers a fresh build
  • pip install unsloth (non-editable) still skips correctly

Skip the frontend npm install + build if frontend/dist already exists.
Previously setup.ps1 nuked node_modules and package-lock.json on every
run, and both scripts always rebuilt even when dist/ was already present.

On a git clone editable install, the first setup run still builds the
frontend as before. Subsequent runs skip it, saving several minutes.
To force a rebuild, delete frontend/dist and re-run setup.
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1f075a932b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread studio/setup.sh
echo "✅ Frontend pre-built (PyPI) — skipping Node/npm check."
# Skip frontend build if dist/ already exists (PyPI wheel, previous build, etc.)
# To force a rebuild, delete frontend/dist and re-run setup.
if [ -d "$SCRIPT_DIR/frontend/dist" ]; then
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Rebuild frontend when source code changes

This new guard skips the entire frontend build whenever frontend/dist already exists, so a developer who runs setup.sh, pulls new frontend changes, and runs setup again will keep serving stale assets from the previous commit. That can produce UI/backend contract mismatches (new backend endpoints or payloads with old JS bundle) even though setup reports success; the previous logic rebuilt for editable/source installs specifically to avoid this upgrade path regression.

Useful? React with 👍 / 👎.

Instead of only checking if dist/ exists, compare source file timestamps
against the dist/ directory. If any file in frontend/src/ is newer than
dist/, trigger a rebuild. This handles the case where a developer pulls
new frontend changes and re-runs setup -- stale assets get rebuilt
automatically.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e36f09664

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread studio/setup.sh Outdated
_NEED_FRONTEND_BUILD=true
if [ -d "$SCRIPT_DIR/frontend/dist" ]; then
# Check if any src/ file is newer than dist/
_newest_src=$(find "$SCRIPT_DIR/frontend/src" -type f -newer "$SCRIPT_DIR/frontend/dist" 2>/dev/null | head -1)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Include non-src files in frontend rebuild check

This freshness probe only inspects frontend/src, but the bundle also depends on other tracked inputs (frontend/public/*, package.json, vite.config.ts, etc.). If a user pulls a commit that changes one of those files without touching src, _NEED_FRONTEND_BUILD remains false and setup skips npm install/npm run build, leaving stale frontend/dist assets from the previous checkout.

Useful? React with 👍 / 👎.

Comment thread studio/setup.sh
Comment on lines +55 to +56
if [ "$_NEED_FRONTEND_BUILD" = false ]; then
echo "✅ Frontend already built and up to date -- skipping Node/npm check."
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Decouple oxc-validator install from frontend cache gate

This condition now controls the whole Node block, including the later run_quiet "npm install (oxc validator runtime)" call in the same else block. On reruns where dist is considered up to date, setup skips reinstalling backend/core/data_recipe/oxc-validator dependencies, so dependency updates in that package are not applied after a pull.

Useful? React with 👍 / 👎.

Address review feedback:

- Check entire frontend/ directory for changes, not just src/.
  The build also depends on package.json, vite.config.ts,
  tailwind.config.ts, public/, and other config files. A change
  to any of these now triggers a rebuild.
- Move oxc-validator npm install outside the frontend build gate
  in setup.sh so it always runs on setup, matching setup.ps1
  which already had it outside the gate.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 153c203a89

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread studio/setup.sh
# ── oxc-validator runtime (always install, independent of frontend build) ──
if [ -d "$SCRIPT_DIR/backend/core/data_recipe/oxc-validator" ]; then
cd "$SCRIPT_DIR/backend/core/data_recipe/oxc-validator"
run_quiet "npm install (oxc validator runtime)" npm install
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Install Node before unconditional oxc-validator npm step

This npm install now runs even when _NEED_FRONTEND_BUILD=false, but the Node/npm bootstrap is only executed inside the frontend-build path. On environments that have prebuilt frontend/dist but no Node (a common pip/wheel setup path), setup now reaches this command and exits with 127 (npm: command not found) instead of provisioning prerequisites.

Useful? React with 👍 / 👎.

Comment thread studio/setup.sh Outdated
# and upgrades/pulls (source newer than dist/ triggers rebuild).
_NEED_FRONTEND_BUILD=true
if [ -d "$SCRIPT_DIR/frontend/dist" ]; then
_changed=$(find "$SCRIPT_DIR/frontend" -maxdepth 1 -name "*.json" -o -name "*.ts" -o -name "*.js" -o -name "*.mjs" | \
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Track HTML entrypoint in frontend freshness check

Fresh evidence for the earlier non-src rebuild concern: this allowlist only matches *.json, *.ts, *.js, and *.mjs, so changes to other top-level build inputs like frontend/index.html are ignored. In that case _NEED_FRONTEND_BUILD remains false and setup reuses stale frontend/dist even though Vite's entry HTML changed.

Useful? React with 👍 / 👎.

Comment thread studio/setup.sh Outdated
xargs -r find -newer "$SCRIPT_DIR/frontend/dist" 2>/dev/null | head -1)
if [ -z "$_changed" ]; then
_changed=$(find "$SCRIPT_DIR/frontend/src" "$SCRIPT_DIR/frontend/public" \
-type f -newer "$SCRIPT_DIR/frontend/dist" 2>/dev/null | head -1)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Replace pipefail-sensitive head -1 freshness probe

Because the script runs with set -euo pipefail, this find ... | head -1 command substitution can fail with exit code 141 when many files are newer than dist (upstream find gets SIGPIPE after head exits). That aborts setup before it can decide to rebuild, so large pulls/rebases can fail nondeterministically during the freshness check.

Useful? React with 👍 / 👎.

- setup.sh: Replace broken `find | xargs find -newer` pipeline with
  single `find ... -newer` call. The old pipeline produced "paths must
  precede expression" errors (silently suppressed by 2>/dev/null),
  causing top-level config changes to never trigger a rebuild.
- setup.sh: Add `command -v npm` guard to oxc-validator block so it
  does not fail when Node was not installed (build-skip path).
- setup.ps1: Replace `Get-ChildItem -Include` (unreliable without
  -Recurse on PS 5.1) with explicit directory paths for src/ and
  public/ scanning.
- Both: Add *.html to tracked file patterns so index.html (Vite
  entry point) changes trigger a rebuild.
- Both: Use -print -quit instead of piping to head -1 for efficiency.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 21e549c069

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread studio/setup.sh
Comment on lines +56 to +57
_changed=$(find "$SCRIPT_DIR/frontend/src" "$SCRIPT_DIR/frontend/public" \
-type f -newer "$SCRIPT_DIR/frontend/dist" -print -quit 2>/dev/null)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Guard recursive find against missing frontend dirs

With set -euo pipefail, the find "$SCRIPT_DIR/frontend/src" "$SCRIPT_DIR/frontend/public" ... command substitution exits the whole setup when either directory is absent and no earlier match triggers -quit. That makes cached runs fail in installs that omit one of these folders (for example, slimmed package layouts), even though the intended behavior is to treat “no tracked changes” as a cache hit. Build freshness probing should only pass existing paths to find (or otherwise neutralize missing-path exit codes).

Useful? React with 👍 / 👎.

@danielhanchen
Copy link
Copy Markdown
Member Author

/gemini review

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces caching for the frontend build in both setup.ps1 and setup.sh to speed up subsequent setup runs. The implementation correctly checks for the existence and modification time of the frontend/dist directory against source files. The logic is sound and efficiently implemented. Additionally, the setup.sh script refactors the oxc-validator installation to be independent of the frontend build, which is a good improvement. I have one minor suggestion for setup.sh to make a file search more precise.

Comment thread studio/setup.sh
Comment on lines +51 to +53
_changed=$(find "$SCRIPT_DIR/frontend" -maxdepth 1 \
\( -name "*.json" -o -name "*.ts" -o -name "*.js" -o -name "*.mjs" -o -name "*.html" \) \
-newer "$SCRIPT_DIR/frontend/dist" -print -quit 2>/dev/null)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The find command here is missing the -type f predicate. Without it, the command could potentially match a directory with a name like config.js. While unlikely, adding -type f would make the check more precise by ensuring only files are considered, which aligns with the intent of checking for changes in configuration files.

Suggested change
_changed=$(find "$SCRIPT_DIR/frontend" -maxdepth 1 \
\( -name "*.json" -o -name "*.ts" -o -name "*.js" -o -name "*.mjs" -o -name "*.html" \) \
-newer "$SCRIPT_DIR/frontend/dist" -print -quit 2>/dev/null)
_changed=$(find "$SCRIPT_DIR/frontend" -maxdepth 1 -type f \
\( -name "*.json" -o -name "*.ts" -o -name "*.js" -o -name "*.mjs" -o -name "*.html" \) \
-newer "$SCRIPT_DIR/frontend/dist" -print -quit 2>/dev/null)

danielhanchen added a commit that referenced this pull request Mar 18, 2026
… CPU support (#4413)

* Allow Windows setup to complete without NVIDIA GPU

setup.ps1 previously hard-exited if nvidia-smi was not found, blocking
setup entirely on CPU-only or non-NVIDIA machines. The backend already
supports CPU and MLX (Apple Silicon) in chat-only GGUF mode, and the
Linux/Mac setup.sh handles missing GPUs gracefully.

Changes:
- Convert the GPU check from a hard exit to a warning
- Guard CUDA toolkit installation behind $HasNvidiaSmi
- Install CPU-only PyTorch when no GPU is detected
- Build llama.cpp without CUDA flags when no GPU is present
- Update doc comment to reflect CPU support

* Cache frontend build across setup runs

Skip the frontend npm install + build if frontend/dist already exists.
Previously setup.ps1 nuked node_modules and package-lock.json on every
run, and both scripts always rebuilt even when dist/ was already present.

On a git clone editable install, the first setup run still builds the
frontend as before. Subsequent runs skip it, saving several minutes.
To force a rebuild, delete frontend/dist and re-run setup.

* Show pip progress for PyTorch download on Windows

The torch CUDA wheel is ~2.8 GB and the CPU wheel is ~300 MB. With
| Out-Null suppressing all output, the install appeared completely
frozen with no feedback. Remove | Out-Null for the torch install
lines so pip's download progress bar is visible. Add a size hint
so users know the download is expected to take a while.

Also moves the Triton success message inside the GPU branch so it
only prints when Triton was actually installed.

* Guard CUDA env re-sanitization behind GPU check in llama.cpp build

The CUDA_PATH re-sanitization block (lines 1020-1033) references
$CudaToolkitRoot which is only set when $HasNvidiaSmi is true and
the CUDA Toolkit section runs. On CPU-only machines, $CudaToolkitRoot
is null, causing Split-Path to throw:

  Split-Path : Cannot bind argument to parameter 'Path' because it is null.

Wrap the entire block in `if ($HasNvidiaSmi -and $CudaToolkitRoot)`.

* Rebuild frontend when source files are newer than dist/

Instead of only checking if dist/ exists, compare source file timestamps
against the dist/ directory. If any file in frontend/src/ is newer than
dist/, trigger a rebuild. This handles the case where a developer pulls
new frontend changes and re-runs setup -- stale assets get rebuilt
automatically.

* Fix cmake not found on Windows after winget install

Two issues fixed:

1. After winget installs cmake, Refresh-Environment may not pick up the
   new PATH entry (MSI PATH changes sometimes need a new shell). Added a
   fallback that probes cmake's default install locations (Program Files,
   LocalAppData) and adds the directory to PATH explicitly if found.

2. If cmake is still unavailable when the llama.cpp build starts (e.g.
   winget failed silently or PATH was not updated), the build now skips
   gracefully with a [SKIP] warning instead of crashing with
   "cmake : The term 'cmake' is not recognized".

* Fix frontend rebuild detection and decouple oxc-validator install

Address review feedback:

- Check entire frontend/ directory for changes, not just src/.
  The build also depends on package.json, vite.config.ts,
  tailwind.config.ts, public/, and other config files. A change
  to any of these now triggers a rebuild.
- Move oxc-validator npm install outside the frontend build gate
  in setup.sh so it always runs on setup, matching setup.ps1
  which already had it outside the gate.

* Show cmake errors on failure and retry CUDA VS integration with elevation

Two fixes for issue #4405 (Windows setup fails at cmake configure):

1. cmake configure: capture output and display it on failure instead of
   piping to Out-Null. When the error mentions "No CUDA toolset found",
   print a hint about the CUDA VS integration files.

2. CUDA VS integration copy: when the direct Copy-Item fails (needs
   admin access to write to Program Files), retry with Start-Process
   -Verb RunAs to prompt for elevation. This is the root cause of the
   "No CUDA toolset found" cmake failure -- the .targets files that let
   MSBuild compile .cu files are missing from the VS BuildCustomizations
   directory.

* Address reviewer feedback: cmake PATH persistence, stale cache, torch error check

1. Persist cmake PATH to user registry so Refresh-Environment cannot
   drop it later in the same setup run. Previously the process-only
   PATH addition at phase 1 could vanish when Refresh-Environment
   rebuilt PATH from registry during phase 2/3 installs.

2. Clean stale CMake cache before configure. If a previous run built
   with CUDA and the user reruns without a GPU (or vice versa), the
   cached GGML_CUDA value would persist. Now the build dir is removed
   before configure.

3. Explicitly set -DGGML_CUDA=OFF for CPU-only builds instead of just
   omitting CUDA flags. This prevents cmake from auto-detecting a
   partial CUDA installation.

4. Fix CUDA cmake flag indentation -- was misaligned from the original
   PR, now consistently indented inside the if/else block.

5. Fail hard if pip install torch returns a non-zero exit code instead
   of silently continuing with a broken environment.

* Remove extra CUDA cmake flags to align Windows with Linux build

Drop GGML_CUDA_FA_ALL_QUANTS, GGML_CUDA_F16, GGML_CUDA_GRAPHS,
GGML_CUDA_FORCE_CUBLAS, and GGML_CUDA_PEER_MAX_BATCH_SIZE flags.
The Linux build in setup.sh only sets GGML_CUDA=ON and lets llama.cpp
use its defaults for everything else. Keep Windows consistent.

* Address reviewer round 2: GPU probe fallback, Triton check, stale binary rebuild

1. GPU detection: fallback to default nvidia-smi install locations
   (Program Files\NVIDIA Corporation\NVSMI, System32) when nvidia-smi
   is not on PATH. Prevents silent CPU-only provisioning on machines
   that have a GPU but a broken PATH.

2. Triton: check $LASTEXITCODE after pip install and print [WARN]
   on failure instead of unconditional [OK].

3. Stale llama-server: check CMakeCache.txt for GGML_CUDA setting
   and rebuild if the existing binary does not match the current GPU
   mode (e.g. CUDA binary on a now-CPU-only rerun, or vice versa).

* Fix frontend rebuild detection and npm dependency issues

Addresses reviewer feedback on the frontend caching logic:

1. setup.sh: Fix broken find command that caused exit under pipefail.
   The piped `find | xargs find -newer` had paths after the expression
   which GNU find rejects. Replaced with a simpler `find -maxdepth 1
   -type f -newer dist/` that checks ALL top-level files (catches
   index.html, bun.lock, etc. that the extension allowlist missed).

2. setup.sh: Guard oxc-validator npm install behind `command -v npm`
   check. When the frontend build is skipped (dist/ is cached), Node
   bootstrap is also skipped, so npm may not be available.

3. setup.ps1: Replace Get-ChildItem -Include with explicit path
   probing for src/ and public/. PowerShell's -Include without a
   trailing wildcard silently returns nothing, so src/public changes
   were never detected. Also check ALL top-level files instead of
   just .json/.ts/.js/.mjs extensions.

* Fix studio setup: venv isolation, centralized .venv_t5, uv targeting

- All platforms (including Colab) now create ~/.unsloth/studio/.venv
  with --without-pip fallback for broken ensurepip environments
- Add --python sys.executable to uv pip install in install_python_stack.py
  so uv targets the correct venv instead of system Python
- Centralize .venv_t5 bootstrap in transformers_version.py with proper
  validation (checks required packages exist, not just non-empty dir)
- Replace ~150 lines of duplicated install code across 3 worker files
  with calls to the shared _ensure_venv_t5_exists() helper
- Use uv-if-present with pip fallback; do not install uv at runtime
- Add site.addsitedir() shim in colab.py so notebook cells can import
  studio packages from the venv without system-Python double-install
- Update .venv_t5 packages: huggingface_hub 1.3.0->1.7.1, add hf_xet
- Bump transformers pin 4.57.1->4.57.6 in requirements + constraints
- Add Fast-Install helper to setup.ps1 with uv+pip fallback
- Keep Colab-specific completion banner in setup.sh

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix nvidia-smi PATH persistence and cmake requirement for CPU-only

1. Store nvidia-smi as an absolute path ($NvidiaSmiExe) on first
   detection. All later calls (Get-CudaComputeCapability,
   Get-PytorchCudaTag, CUDA toolkit detection) use this absolute
   path instead of relying on PATH. This survives Refresh-Environment
   which rebuilds PATH from the registry and drops process-only
   additions.

2. Make cmake fatal for CPU-only installs. CPU-only machines depend
   entirely on llama-server for GGUF chat mode, so reporting "Setup
   Complete!" without it is misleading. GPU machines can still skip
   the llama-server build since they have other inference paths.

* Fix broken frontend freshness detection in setup scripts

- setup.sh: Replace broken `find | xargs find -newer` pipeline with
  single `find ... -newer` call. The old pipeline produced "paths must
  precede expression" errors (silently suppressed by 2>/dev/null),
  causing top-level config changes to never trigger a rebuild.
- setup.sh: Add `command -v npm` guard to oxc-validator block so it
  does not fail when Node was not installed (build-skip path).
- setup.ps1: Replace `Get-ChildItem -Include` (unreliable without
  -Recurse on PS 5.1) with explicit directory paths for src/ and
  public/ scanning.
- Both: Add *.html to tracked file patterns so index.html (Vite
  entry point) changes trigger a rebuild.
- Both: Use -print -quit instead of piping to head -1 for efficiency.

* Fix bugs found during review of PRs #4404, #4400, #4399

- setup.sh: Add || true guard to find command that checks frontend/src
  and frontend/public dirs, preventing script abort under set -euo
  pipefail when either directory is missing

- colab.py: Use sys.path.insert(0, ...) instead of site.addsitedir()
  so Studio venv packages take priority over system copies. Add warning
  when venv is missing instead of silently failing.

- transformers_version.py: _venv_t5_is_valid() now checks installed
  package versions via .dist-info metadata, not just directory presence.
  Prevents false positives from stale or wrong-version packages.

- transformers_version.py: _install_to_venv_t5() now passes --upgrade
  so pip replaces existing stale packages in the target directory.

- setup.ps1: CPU-only PyTorch install uses --index-url for cpu wheel
  and all install commands use Fast-Install (uv with pip fallback).

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix _venv_t5_is_valid dist-info loop exiting after first directory

Remove premature break that caused the loop over .dist-info directories
to exit after the first match even if it had no METADATA file. Now
continues iterating until a valid METADATA is found or all dirs are
exhausted.

* Capture error output on failure instead of discarding with Out-Null

setup.ps1: 6 locations changed from `| Out-Null` to `| Out-String` with
output shown on failure -- PyTorch GPU/CPU install, Triton install,
venv_t5 package loop, cmake llama-server and llama-quantize builds.

transformers_version.py: clean stale .venv_t5 directory before reinstall
when validation detects missing or version-mismatched packages.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix ModuleNotFoundError when CLI imports studio.backend.core

The backend uses bare "from utils.*" imports everywhere, relying on
backend/ being on sys.path. Workers and routes add it at startup, but
the CLI imports studio.backend.core as a package -- backend/ was never
added. Add sys.path setup at the top of core/__init__.py so lazy
imports resolve correctly regardless of entry point.

Fixes: unsloth inference unsloth/Qwen3-8B "who are you" crashing with
"No module named 'utils'"

* Fix frontend freshness check to detect all top-level file changes

The extension allowlist (*.json, *.ts, *.js, *.mjs, *.html) missed
files like bun.lock, so lockfile-only dependency changes could skip
the frontend rebuild. Check all top-level files instead.

* Add tiktoken to .venv_t5 for Qwen-family tokenizers

Qwen models use tiktoken-based tokenizers which fail when routed through
the transformers 5.x overlay without tiktoken installed. Add it to the
setup scripts (with deps for Windows) and runtime fallback list.

Integrates PR #4418.

* Fix tiktoken crash in _venv_t5_is_valid and stray brace in setup.ps1

_venv_t5_is_valid() crashed with ValueError on unpinned packages like
"tiktoken" (no ==version). Handle by splitting safely and skipping
version check for unpinned packages (existence check only).

Also remove stray closing brace in setup.ps1 tiktoken install block.

---------

Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@rolandtannous
Copy link
Copy Markdown
Contributor

@danielhanchen
unsloth studio setup is only run as a one time setup step or everytime there is an update to the package after we pip install it. The frontend needs to be rebuilt to pick up any changes being introduced as part of the setup process.

If we cache the frontend and don't rebuild it , the frontend won't pickup the new changes.

danielhanchen added a commit that referenced this pull request Mar 18, 2026
…_BUILD=true (#4427)

* revert: remove frontend build caching from setup scripts

The mtime-based caching introduced in #4404/#4413 can incorrectly skip
frontend builds -- e.g. after git pull when filesystem timestamps are
not preserved, or after our Tailwind v4 discovery that the site-packages
.gitignore must be hidden before vite build (which the cached path
doesn't handle).

Always rebuild the frontend on setup. The build takes ~15s and is
safer than risking a stale dist/.

* revert: disable frontend build caching, keep code commented out

Caching disabled by always setting _NEED_FRONTEND_BUILD=true.
The mtime-based logic is preserved in comments for future re-enabling.

Reasons for disabling:
- Git does not preserve file timestamps, so cached dist/ can appear
  newer than freshly checked-out source after a pull
- Tailwind v4 requires hiding site-packages/.gitignore before vite
  build; the cache path bypasses this, producing broken CSS

* revert: always rebuild frontend, remove mtime caching

* revert: always rebuild frontend, override caching with _NEED_FRONTEND_BUILD=true
shibizhao pushed a commit to shibizhao/unsloth-npu that referenced this pull request Apr 7, 2026
… CPU support (unslothai#4413)

* Allow Windows setup to complete without NVIDIA GPU

setup.ps1 previously hard-exited if nvidia-smi was not found, blocking
setup entirely on CPU-only or non-NVIDIA machines. The backend already
supports CPU and MLX (Apple Silicon) in chat-only GGUF mode, and the
Linux/Mac setup.sh handles missing GPUs gracefully.

Changes:
- Convert the GPU check from a hard exit to a warning
- Guard CUDA toolkit installation behind $HasNvidiaSmi
- Install CPU-only PyTorch when no GPU is detected
- Build llama.cpp without CUDA flags when no GPU is present
- Update doc comment to reflect CPU support

* Cache frontend build across setup runs

Skip the frontend npm install + build if frontend/dist already exists.
Previously setup.ps1 nuked node_modules and package-lock.json on every
run, and both scripts always rebuilt even when dist/ was already present.

On a git clone editable install, the first setup run still builds the
frontend as before. Subsequent runs skip it, saving several minutes.
To force a rebuild, delete frontend/dist and re-run setup.

* Show pip progress for PyTorch download on Windows

The torch CUDA wheel is ~2.8 GB and the CPU wheel is ~300 MB. With
| Out-Null suppressing all output, the install appeared completely
frozen with no feedback. Remove | Out-Null for the torch install
lines so pip's download progress bar is visible. Add a size hint
so users know the download is expected to take a while.

Also moves the Triton success message inside the GPU branch so it
only prints when Triton was actually installed.

* Guard CUDA env re-sanitization behind GPU check in llama.cpp build

The CUDA_PATH re-sanitization block (lines 1020-1033) references
$CudaToolkitRoot which is only set when $HasNvidiaSmi is true and
the CUDA Toolkit section runs. On CPU-only machines, $CudaToolkitRoot
is null, causing Split-Path to throw:

  Split-Path : Cannot bind argument to parameter 'Path' because it is null.

Wrap the entire block in `if ($HasNvidiaSmi -and $CudaToolkitRoot)`.

* Rebuild frontend when source files are newer than dist/

Instead of only checking if dist/ exists, compare source file timestamps
against the dist/ directory. If any file in frontend/src/ is newer than
dist/, trigger a rebuild. This handles the case where a developer pulls
new frontend changes and re-runs setup -- stale assets get rebuilt
automatically.

* Fix cmake not found on Windows after winget install

Two issues fixed:

1. After winget installs cmake, Refresh-Environment may not pick up the
   new PATH entry (MSI PATH changes sometimes need a new shell). Added a
   fallback that probes cmake's default install locations (Program Files,
   LocalAppData) and adds the directory to PATH explicitly if found.

2. If cmake is still unavailable when the llama.cpp build starts (e.g.
   winget failed silently or PATH was not updated), the build now skips
   gracefully with a [SKIP] warning instead of crashing with
   "cmake : The term 'cmake' is not recognized".

* Fix frontend rebuild detection and decouple oxc-validator install

Address review feedback:

- Check entire frontend/ directory for changes, not just src/.
  The build also depends on package.json, vite.config.ts,
  tailwind.config.ts, public/, and other config files. A change
  to any of these now triggers a rebuild.
- Move oxc-validator npm install outside the frontend build gate
  in setup.sh so it always runs on setup, matching setup.ps1
  which already had it outside the gate.

* Show cmake errors on failure and retry CUDA VS integration with elevation

Two fixes for issue unslothai#4405 (Windows setup fails at cmake configure):

1. cmake configure: capture output and display it on failure instead of
   piping to Out-Null. When the error mentions "No CUDA toolset found",
   print a hint about the CUDA VS integration files.

2. CUDA VS integration copy: when the direct Copy-Item fails (needs
   admin access to write to Program Files), retry with Start-Process
   -Verb RunAs to prompt for elevation. This is the root cause of the
   "No CUDA toolset found" cmake failure -- the .targets files that let
   MSBuild compile .cu files are missing from the VS BuildCustomizations
   directory.

* Address reviewer feedback: cmake PATH persistence, stale cache, torch error check

1. Persist cmake PATH to user registry so Refresh-Environment cannot
   drop it later in the same setup run. Previously the process-only
   PATH addition at phase 1 could vanish when Refresh-Environment
   rebuilt PATH from registry during phase 2/3 installs.

2. Clean stale CMake cache before configure. If a previous run built
   with CUDA and the user reruns without a GPU (or vice versa), the
   cached GGML_CUDA value would persist. Now the build dir is removed
   before configure.

3. Explicitly set -DGGML_CUDA=OFF for CPU-only builds instead of just
   omitting CUDA flags. This prevents cmake from auto-detecting a
   partial CUDA installation.

4. Fix CUDA cmake flag indentation -- was misaligned from the original
   PR, now consistently indented inside the if/else block.

5. Fail hard if pip install torch returns a non-zero exit code instead
   of silently continuing with a broken environment.

* Remove extra CUDA cmake flags to align Windows with Linux build

Drop GGML_CUDA_FA_ALL_QUANTS, GGML_CUDA_F16, GGML_CUDA_GRAPHS,
GGML_CUDA_FORCE_CUBLAS, and GGML_CUDA_PEER_MAX_BATCH_SIZE flags.
The Linux build in setup.sh only sets GGML_CUDA=ON and lets llama.cpp
use its defaults for everything else. Keep Windows consistent.

* Address reviewer round 2: GPU probe fallback, Triton check, stale binary rebuild

1. GPU detection: fallback to default nvidia-smi install locations
   (Program Files\NVIDIA Corporation\NVSMI, System32) when nvidia-smi
   is not on PATH. Prevents silent CPU-only provisioning on machines
   that have a GPU but a broken PATH.

2. Triton: check $LASTEXITCODE after pip install and print [WARN]
   on failure instead of unconditional [OK].

3. Stale llama-server: check CMakeCache.txt for GGML_CUDA setting
   and rebuild if the existing binary does not match the current GPU
   mode (e.g. CUDA binary on a now-CPU-only rerun, or vice versa).

* Fix frontend rebuild detection and npm dependency issues

Addresses reviewer feedback on the frontend caching logic:

1. setup.sh: Fix broken find command that caused exit under pipefail.
   The piped `find | xargs find -newer` had paths after the expression
   which GNU find rejects. Replaced with a simpler `find -maxdepth 1
   -type f -newer dist/` that checks ALL top-level files (catches
   index.html, bun.lock, etc. that the extension allowlist missed).

2. setup.sh: Guard oxc-validator npm install behind `command -v npm`
   check. When the frontend build is skipped (dist/ is cached), Node
   bootstrap is also skipped, so npm may not be available.

3. setup.ps1: Replace Get-ChildItem -Include with explicit path
   probing for src/ and public/. PowerShell's -Include without a
   trailing wildcard silently returns nothing, so src/public changes
   were never detected. Also check ALL top-level files instead of
   just .json/.ts/.js/.mjs extensions.

* Fix studio setup: venv isolation, centralized .venv_t5, uv targeting

- All platforms (including Colab) now create ~/.unsloth/studio/.venv
  with --without-pip fallback for broken ensurepip environments
- Add --python sys.executable to uv pip install in install_python_stack.py
  so uv targets the correct venv instead of system Python
- Centralize .venv_t5 bootstrap in transformers_version.py with proper
  validation (checks required packages exist, not just non-empty dir)
- Replace ~150 lines of duplicated install code across 3 worker files
  with calls to the shared _ensure_venv_t5_exists() helper
- Use uv-if-present with pip fallback; do not install uv at runtime
- Add site.addsitedir() shim in colab.py so notebook cells can import
  studio packages from the venv without system-Python double-install
- Update .venv_t5 packages: huggingface_hub 1.3.0->1.7.1, add hf_xet
- Bump transformers pin 4.57.1->4.57.6 in requirements + constraints
- Add Fast-Install helper to setup.ps1 with uv+pip fallback
- Keep Colab-specific completion banner in setup.sh

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix nvidia-smi PATH persistence and cmake requirement for CPU-only

1. Store nvidia-smi as an absolute path ($NvidiaSmiExe) on first
   detection. All later calls (Get-CudaComputeCapability,
   Get-PytorchCudaTag, CUDA toolkit detection) use this absolute
   path instead of relying on PATH. This survives Refresh-Environment
   which rebuilds PATH from the registry and drops process-only
   additions.

2. Make cmake fatal for CPU-only installs. CPU-only machines depend
   entirely on llama-server for GGUF chat mode, so reporting "Setup
   Complete!" without it is misleading. GPU machines can still skip
   the llama-server build since they have other inference paths.

* Fix broken frontend freshness detection in setup scripts

- setup.sh: Replace broken `find | xargs find -newer` pipeline with
  single `find ... -newer` call. The old pipeline produced "paths must
  precede expression" errors (silently suppressed by 2>/dev/null),
  causing top-level config changes to never trigger a rebuild.
- setup.sh: Add `command -v npm` guard to oxc-validator block so it
  does not fail when Node was not installed (build-skip path).
- setup.ps1: Replace `Get-ChildItem -Include` (unreliable without
  -Recurse on PS 5.1) with explicit directory paths for src/ and
  public/ scanning.
- Both: Add *.html to tracked file patterns so index.html (Vite
  entry point) changes trigger a rebuild.
- Both: Use -print -quit instead of piping to head -1 for efficiency.

* Fix bugs found during review of PRs unslothai#4404, unslothai#4400, unslothai#4399

- setup.sh: Add || true guard to find command that checks frontend/src
  and frontend/public dirs, preventing script abort under set -euo
  pipefail when either directory is missing

- colab.py: Use sys.path.insert(0, ...) instead of site.addsitedir()
  so Studio venv packages take priority over system copies. Add warning
  when venv is missing instead of silently failing.

- transformers_version.py: _venv_t5_is_valid() now checks installed
  package versions via .dist-info metadata, not just directory presence.
  Prevents false positives from stale or wrong-version packages.

- transformers_version.py: _install_to_venv_t5() now passes --upgrade
  so pip replaces existing stale packages in the target directory.

- setup.ps1: CPU-only PyTorch install uses --index-url for cpu wheel
  and all install commands use Fast-Install (uv with pip fallback).

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix _venv_t5_is_valid dist-info loop exiting after first directory

Remove premature break that caused the loop over .dist-info directories
to exit after the first match even if it had no METADATA file. Now
continues iterating until a valid METADATA is found or all dirs are
exhausted.

* Capture error output on failure instead of discarding with Out-Null

setup.ps1: 6 locations changed from `| Out-Null` to `| Out-String` with
output shown on failure -- PyTorch GPU/CPU install, Triton install,
venv_t5 package loop, cmake llama-server and llama-quantize builds.

transformers_version.py: clean stale .venv_t5 directory before reinstall
when validation detects missing or version-mismatched packages.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix ModuleNotFoundError when CLI imports studio.backend.core

The backend uses bare "from utils.*" imports everywhere, relying on
backend/ being on sys.path. Workers and routes add it at startup, but
the CLI imports studio.backend.core as a package -- backend/ was never
added. Add sys.path setup at the top of core/__init__.py so lazy
imports resolve correctly regardless of entry point.

Fixes: unsloth inference unsloth/Qwen3-8B "who are you" crashing with
"No module named 'utils'"

* Fix frontend freshness check to detect all top-level file changes

The extension allowlist (*.json, *.ts, *.js, *.mjs, *.html) missed
files like bun.lock, so lockfile-only dependency changes could skip
the frontend rebuild. Check all top-level files instead.

* Add tiktoken to .venv_t5 for Qwen-family tokenizers

Qwen models use tiktoken-based tokenizers which fail when routed through
the transformers 5.x overlay without tiktoken installed. Add it to the
setup scripts (with deps for Windows) and runtime fallback list.

Integrates PR unslothai#4418.

* Fix tiktoken crash in _venv_t5_is_valid and stray brace in setup.ps1

_venv_t5_is_valid() crashed with ValueError on unpinned packages like
"tiktoken" (no ==version). Handle by splitting safely and skipping
version check for unpinned packages (existence check only).

Also remove stray closing brace in setup.ps1 tiktoken install block.

---------

Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
shibizhao pushed a commit to shibizhao/unsloth-npu that referenced this pull request Apr 7, 2026
…_BUILD=true (unslothai#4427)

* revert: remove frontend build caching from setup scripts

The mtime-based caching introduced in unslothai#4404/unslothai#4413 can incorrectly skip
frontend builds -- e.g. after git pull when filesystem timestamps are
not preserved, or after our Tailwind v4 discovery that the site-packages
.gitignore must be hidden before vite build (which the cached path
doesn't handle).

Always rebuild the frontend on setup. The build takes ~15s and is
safer than risking a stale dist/.

* revert: disable frontend build caching, keep code commented out

Caching disabled by always setting _NEED_FRONTEND_BUILD=true.
The mtime-based logic is preserved in comments for future re-enabling.

Reasons for disabling:
- Git does not preserve file timestamps, so cached dist/ can appear
  newer than freshly checked-out source after a pull
- Tailwind v4 requires hiding site-packages/.gitignore before vite
  build; the cache path bypasses this, producing broken CSS

* revert: always rebuild frontend, remove mtime caching

* revert: always rebuild frontend, override caching with _NEED_FRONTEND_BUILD=true
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants